Integrated Survey Data

Overview and conditions of access

Pierre Walthéry & Vanessa Higgins

UK Data Service

October 2025

Plan of the presentation

  1. Most common surveys with integrated data
  2. Typical data integrated with surveys
  3. Accessing secure integrated datasets

Introduction

Integrated data:

  • When we add non-survey data to survey data

    • Whether part of the original data collection or not
    • Whether primary or secondary
    • Whether same unit of analysis or not
    • Validation or enhancement (Benzeval et al., 2020)
  • Typically: administrative, biometric, social media data

  • Examples: accelerometer, genetic data, individual NHS or PAYE records…

  • This talk mostly deals with integrated data available at the UK Data Service
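A minimal sketch of what integration can look like in practice, assuming invented toy records: the identifiers, field names and values below are hypothetical and do not come from any UK Data Service dataset. It contrasts a link at the same unit of analysis (one administrative record per person) with a link at a different unit (an area-level value shared by all respondents living in that area).

```python
# Hypothetical toy records: every identifier, field name and value is invented.
survey = [
    {"pid": 1, "area": "A", "age": 34},
    {"pid": 2, "area": "B", "age": 51},
    {"pid": 3, "area": "A", "age": 47},
]

# Same unit of analysis (individuals): at most one admin record per person.
admin_by_person = {1: {"hospital_episodes": 2}, 3: {"hospital_episodes": 0}}

# Different unit of analysis (areas): one value shared by everyone in the area.
green_space_decile_by_area = {"A": 7, "B": 3}


def integrate(survey_rows, person_records, area_values):
    """Return copies of the survey rows enriched with non-survey data."""
    linked = []
    for row in survey_rows:
        merged = dict(row)
        # Person-level link: stays None when no admin record was found.
        person = person_records.get(row["pid"], {})
        merged["hospital_episodes"] = person.get("hospital_episodes")
        # Area-level link: looked up through the respondent's area code.
        merged["green_space_decile"] = area_values.get(row["area"])
        linked.append(merged)
    return linked


linked = integrate(survey, admin_by_person, green_space_decile_by_area)
for row in linked:
    print(row)
```

Respondent 2 has no person-level record, so that field stays missing while the area-level value is still attached; real linked datasets typically show this kind of partial coverage.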

Part 1

What are the surveys with integrated data?

Overview

  • Depends on:

    • The kind of data linked to the surveys, i.e. mainstream topic, availability…
    • The survey itself (i.e. does it include the required linking information / user consent)
    • The scope of the survey, i.e. is linkage part of the original data collection, was it intended, or is it a post hoc project?
    • For a variety of reasons, linkage is more straightforward in some surveys than in others
  • Major longitudinal studies:

    • Birth Cohort studies
    • Next Steps and ELSA
    • Understanding Society
  • A few large scale cross-sectional surveys such as:

    • ASHE (Annual Survey of Hours and Earnings)
    • Health Survey for England
    • Family Resources Survey

Birth cohort studies

  • Follow a sample of individuals* over their whole life
  • Born in a specific week of 1958 (NCDS), 1970 (BCS), 1989-90 (Next Steps), 2000 (MCS), 2026 (?)
  • MCS
    • ~ 19,000 children originally (between June 2001 and Jan 2003)
    • 7 ‘sweeps’: at 9 months, then at 3, 5, 7, 11, 14 and 17 years old
    • parent and child interviews
    • Focuses on education, skills and health, truancy, cognitive ability, and biological measurements, in addition to traditional socio-economic data

Understanding Society

  • The largest longitudinal study representative of the UK population

  • Initial sample size: 40K households, 100K individuals

  • 14 waves so far: 2009-23; includes BHPS data 1991-02

  • Ethnic minority boost samples; Innovation Panel

  • Very wide range of topics covered:

    • Employment, income, benefits, savings, debt, and assets
    • Health, well-being, and health behaviours
    • Housing, housing costs, and dwelling characteristics
    • Family, partnerships, caring responsibilities
    • Education, training
    • Expenditure, consumption, deprivation
    • Social attitudes, values, political opinions
    • Transport, mobility, and commuting patterns
    • Environmental behaviours, and related attitudes

Part 2

What kind of non-survey data is integrated with UKDS surveys?

Overview

  • Administrative records

    • i.e. data collected by a public (state-controlled) authority: a government department, the NHS
    • Health: NHS SHS: medical records, i.e. in/outpatient attendance, hospital episodes, maternity
    • Education: National Pupil Database, school profile/teacher survey; student loan data, OFSTED data
    • Pollution; green space deciles; PAYE data
  • Non-survey measurement: energy, health, behavioural

  • Social media/Digital trace

What is on offer: examples

1. Genetic risk data

  • Polygenic scores (PGI) about health and social outcomes

  • Gene combinations associated with probability of certain outcomes

    • 45 traits, i.e. health outcomes and behaviour; mental health and personality traits; social outcomes

    • Available on the Birth Cohorts and Next Steps datasets

    • Subsamples limited to ‘Europeans’ from a genetic perspective

2. Hospital episodes data

  • NHS data about all hospital admissions in England.
  • Four datasets:
    • Episodes of using: Accident and Emergency; Admitted Patient Care; Adult Critical Care; Outpatients
    • Mostly available for 2007/9-2023
  • Data on diagnosis, maternity, mortality, mental health, length of treatment, deprivation etc.
  • Available for the NCDS Birth Cohort

3. School inspection data

  • OFSTED ‘State of the nation’: anonymised data on the latest school inspection outcomes of 22,000 open schools

  • Linked with the MCS, currently covers years 2005 to 2019

  • Data on a wide range of topics, such as:

    • Effectiveness of leadership and management
    • Pupils’ achievement (aggregated) (2005-2015)
    • Behaviour and safety of pupils (2005-2015)

4. NEST pension data

  • Main employer pension scheme for UK employees

  • Covers 1,000,000 employers, 11 million employees

  • Linked to consenting Understanding Society Wave 11 respondents (about 12,000)

  • Data about:

    • Employer and employee characteristics
    • Current pension status
    • Pension contributions characteristics
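Linkage restricted to consenting respondents can be sketched as below. This is only an illustration under stated assumptions: the respondent identifiers, the `consent_to_link` flag and the `scheme_status` field are hypothetical and are not the variables used in the actual NEST linkage.

```python
# Hypothetical toy data: names and values are invented for illustration only.
respondents = [
    {"pid": 101, "consent_to_link": True},
    {"pid": 102, "consent_to_link": False},
    {"pid": 103, "consent_to_link": True},
]

# Administrative records keyed by the same (hypothetical) person identifier.
pension_records = {
    101: {"scheme_status": "active"},
    102: {"scheme_status": "active"},
}


def link_with_consent(survey_rows, records):
    """Keep admin records only for respondents who consented and were matched."""
    linked = {}
    for row in survey_rows:
        if not row["consent_to_link"]:
            continue  # no consent: the admin record is never attached
        if row["pid"] in records:
            linked[row["pid"]] = records[row["pid"]]
    return linked


linked_pensions = link_with_consent(respondents, pension_records)
print(linked_pensions)
```

Here respondent 102 is dropped for lack of consent and respondent 103 for lack of a matching record, which is why a linked subsample is usually smaller than the survey wave it comes from.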

5. Studies deposited on ReShare

Part 3

Accessing secure integrated datasets

Secure datasets at the UK Data Service

  • A couple of slides on this (waiting for Essex)

Additional resources